Trading off rewards and errors in multi-armed bandits
Internal identifier: 000047 (Main/Exploration); previous: 000046; next: 000048
Authors: Akram Erraqabi [France]; Alessandro Lazaric [France]; Michal Valko [France]; Emma Brunskill [United States]; Yun-En Liu [United States]
Abstract
In multi-armed bandits, the most common objective is the maximization of the cumulative reward. Alternative settings include active exploration, where a learner tries to gain accurate estimates of the rewards of all arms. While these objectives are contrasting, in many scenarios it is desirable to trade off rewards and errors. For instance, in educational games the designer wants to gather generalizable knowledge about the behavior of the students and teaching strategies (small estimation errors) but, at the same time, the system needs to avoid giving a bad experience to the players, who may leave the system permanently (large reward). In this paper, we formalize this tradeoff and introduce the ForcingBalance algorithm whose performance is provably close to the best possible tradeoff strategy. Finally, we demonstrate on real-world educational data that ForcingBalance returns useful information about the arms without compromising the overall reward.
Links toward previous steps (curation, corpus...)
- to stream Hal, to step Corpus: 000659
- to stream Hal, to step Curation: 000659
- to stream Hal, to step Checkpoint: 000033
- to stream Main, to step Merge: 000046
- to stream Main, to step Curation: 000047
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Trading off rewards and errors in multi-armed bandits</title>
<author><name sortKey="Erraqabi, Akram" sort="Erraqabi, Akram" uniqKey="Erraqabi A" first="Akram" last="Erraqabi">Akram Erraqabi</name>
<affiliation wicri:level="1"><hal:affiliation type="researchteam" xml:id="struct-432036" status="VALID"> <idno type="RNSR">200718281V</idno>
<orgName>Sequential Learning</orgName>
<orgName type="acronym">SEQUEL</orgName>
<date type="start">2015-01-01</date>
<desc> <address> <country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/sequel</ref>
</desc>
<listRelation> <relation active="#struct-104752" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-410272" type="direct"></relation>
<relation active="#struct-120930" type="indirect"></relation>
<relation active="#struct-92973" type="indirect"></relation>
<relation active="#struct-301700" type="indirect"></relation>
<relation name="UMR9189" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-302102" type="indirect"></relation>
</listRelation>
<tutelles><tutelle active="#struct-104752" type="direct"><org type="laboratory" xml:id="struct-104752" status="VALID"> <idno type="RNSR">200818245B</idno>
<orgName>Inria Lille - Nord Europe</orgName>
<desc> <address> <addrLine>Parc Scientifique de la Haute Borne 40, avenue Halley Bât.A, Park Plaza 59650 Villeneuve d'Ascq</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/lille/</ref>
</desc>
<listRelation> <relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect"><org type="institution" xml:id="struct-300009" status="VALID"><orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc><address><addrLine>Domaine de Voluceau, Rocquencourt - BP 105, 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-410272" type="direct"><org type="laboratory" xml:id="struct-410272" status="VALID"> <idno type="RNSR">201521249L</idno>
<idno type="IdRef">18388695X</idno>
<orgName>Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189</orgName>
<orgName type="acronym">CRIStAL</orgName>
<date type="start">2015-01-01</date>
<desc> <address> <addrLine>Bâtiment M3, Université Lille 1, 59655 Villeneuve d'Ascq Cedex FRANCE </addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.cristal.univ-lille.fr</ref>
</desc>
<listRelation> <relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-120930" type="direct"></relation>
<relation active="#struct-92973" type="direct"></relation>
<relation active="#struct-301700" type="direct"></relation>
<relation name="UMR9189" active="#struct-441569" type="direct"></relation>
<relation active="#struct-302102" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-120930" type="indirect"><org type="institution" xml:id="struct-120930" status="VALID"> <orgName>Ecole Centrale de Lille</orgName>
<desc> <address> <addrLine>Cité Scientifique - CS 20048 59651 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.ec-lille.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-92973" type="indirect"><org type="institution" xml:id="struct-92973" status="VALID"> <idno type="IdRef">026404184</idno>
<orgName>Université de Lille, Sciences et Technologies</orgName>
<desc> <address> <addrLine>Cité Scientifique - 59655 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille1.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301700" type="indirect"><org type="institution" xml:id="struct-301700" status="VALID"> <idno type="IdRef">026404524</idno>
<idno type="ISNI">0000000121517701</idno>
<orgName>Université de Lille, Sciences Humaines et Sociales</orgName>
<desc> <address> <addrLine>Domaine universitaire du "Pont de Bois", Rue du Barreau, BP 60149, 59653 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille3.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR9189" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"> <idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc> <address> <country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-302102" type="indirect"><org type="institution" xml:id="struct-302102" status="VALID"> <orgName>Institut Mines-Télécom [Paris]</orgName>
<desc> <address> <addrLine>37-39 Rue Dareau, 75014 Paris</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.mines-telecom.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Lazaric, Alessandro" sort="Lazaric, Alessandro" uniqKey="Lazaric A" first="Alessandro" last="Lazaric">Alessandro Lazaric</name>
<affiliation wicri:level="1"><hal:affiliation type="researchteam" xml:id="struct-432036" status="VALID"> <idno type="RNSR">200718281V</idno>
<orgName>Sequential Learning</orgName>
<orgName type="acronym">SEQUEL</orgName>
<date type="start">2015-01-01</date>
<desc> <address> <country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/sequel</ref>
</desc>
<listRelation> <relation active="#struct-104752" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-410272" type="direct"></relation>
<relation active="#struct-120930" type="indirect"></relation>
<relation active="#struct-92973" type="indirect"></relation>
<relation active="#struct-301700" type="indirect"></relation>
<relation name="UMR9189" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-302102" type="indirect"></relation>
</listRelation>
<tutelles><tutelle active="#struct-104752" type="direct"><org type="laboratory" xml:id="struct-104752" status="VALID"> <idno type="RNSR">200818245B</idno>
<orgName>Inria Lille - Nord Europe</orgName>
<desc> <address> <addrLine>Parc Scientifique de la Haute Borne 40, avenue Halley Bât.A, Park Plaza 59650 Villeneuve d'Ascq</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/lille/</ref>
</desc>
<listRelation> <relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect"><org type="institution" xml:id="struct-300009" status="VALID"><orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc><address><addrLine>Domaine de Voluceau, Rocquencourt - BP 105, 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-410272" type="direct"><org type="laboratory" xml:id="struct-410272" status="VALID"> <idno type="RNSR">201521249L</idno>
<idno type="IdRef">18388695X</idno>
<orgName>Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189</orgName>
<orgName type="acronym">CRIStAL</orgName>
<date type="start">2015-01-01</date>
<desc> <address> <addrLine>Bâtiment M3, Université Lille 1, 59655 Villeneuve d'Ascq Cedex FRANCE </addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.cristal.univ-lille.fr</ref>
</desc>
<listRelation> <relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-120930" type="direct"></relation>
<relation active="#struct-92973" type="direct"></relation>
<relation active="#struct-301700" type="direct"></relation>
<relation name="UMR9189" active="#struct-441569" type="direct"></relation>
<relation active="#struct-302102" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-120930" type="indirect"><org type="institution" xml:id="struct-120930" status="VALID"> <orgName>Ecole Centrale de Lille</orgName>
<desc> <address> <addrLine>Cité Scientifique - CS 20048 59651 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.ec-lille.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-92973" type="indirect"><org type="institution" xml:id="struct-92973" status="VALID"> <idno type="IdRef">026404184</idno>
<orgName>Université de Lille, Sciences et Technologies</orgName>
<desc> <address> <addrLine>Cité Scientifique - 59655 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille1.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301700" type="indirect"><org type="institution" xml:id="struct-301700" status="VALID"> <idno type="IdRef">026404524</idno>
<idno type="ISNI">0000000121517701</idno>
<orgName>Université de Lille, Sciences Humaines et Sociales</orgName>
<desc> <address> <addrLine>Domaine universitaire du "Pont de Bois", Rue du Barreau, BP 60149, 59653 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille3.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR9189" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"> <idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc> <address> <country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-302102" type="indirect"><org type="institution" xml:id="struct-302102" status="VALID"> <orgName>Institut Mines-Télécom [Paris]</orgName>
<desc> <address> <addrLine>37-39 Rue Dareau, 75014 Paris</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.mines-telecom.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Valko, Michal" sort="Valko, Michal" uniqKey="Valko M" first="Michal" last="Valko">Michal Valko</name>
<affiliation wicri:level="1"><hal:affiliation type="researchteam" xml:id="struct-432036" status="VALID"> <idno type="RNSR">200718281V</idno>
<orgName>Sequential Learning</orgName>
<orgName type="acronym">SEQUEL</orgName>
<date type="start">2015-01-01</date>
<desc> <address> <country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/sequel</ref>
</desc>
<listRelation> <relation active="#struct-104752" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-410272" type="direct"></relation>
<relation active="#struct-120930" type="indirect"></relation>
<relation active="#struct-92973" type="indirect"></relation>
<relation active="#struct-301700" type="indirect"></relation>
<relation name="UMR9189" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-302102" type="indirect"></relation>
</listRelation>
<tutelles><tutelle active="#struct-104752" type="direct"><org type="laboratory" xml:id="struct-104752" status="VALID"> <idno type="RNSR">200818245B</idno>
<orgName>Inria Lille - Nord Europe</orgName>
<desc> <address> <addrLine>Parc Scientifique de la Haute Borne 40, avenue Halley Bât.A, Park Plaza 59650 Villeneuve d'Ascq</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/lille/</ref>
</desc>
<listRelation> <relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect"><org type="institution" xml:id="struct-300009" status="VALID"><orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc><address><addrLine>Domaine de Voluceau, Rocquencourt - BP 105, 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-410272" type="direct"><org type="laboratory" xml:id="struct-410272" status="VALID"> <idno type="RNSR">201521249L</idno>
<idno type="IdRef">18388695X</idno>
<orgName>Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189</orgName>
<orgName type="acronym">CRIStAL</orgName>
<date type="start">2015-01-01</date>
<desc> <address> <addrLine>Bâtiment M3, Université Lille 1, 59655 Villeneuve d'Ascq Cedex FRANCE </addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.cristal.univ-lille.fr</ref>
</desc>
<listRelation> <relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-120930" type="direct"></relation>
<relation active="#struct-92973" type="direct"></relation>
<relation active="#struct-301700" type="direct"></relation>
<relation name="UMR9189" active="#struct-441569" type="direct"></relation>
<relation active="#struct-302102" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-120930" type="indirect"><org type="institution" xml:id="struct-120930" status="VALID"> <orgName>Ecole Centrale de Lille</orgName>
<desc> <address> <addrLine>Cité Scientifique - CS 20048 59651 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.ec-lille.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-92973" type="indirect"><org type="institution" xml:id="struct-92973" status="VALID"> <idno type="IdRef">026404184</idno>
<orgName>Université de Lille, Sciences et Technologies</orgName>
<desc> <address> <addrLine>Cité Scientifique - 59655 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille1.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301700" type="indirect"><org type="institution" xml:id="struct-301700" status="VALID"> <idno type="IdRef">026404524</idno>
<idno type="ISNI">0000000121517701</idno>
<orgName>Université de Lille, Sciences Humaines et Sociales</orgName>
<desc> <address> <addrLine>Domaine universitaire du "Pont de Bois", Rue du Barreau, BP 60149, 59653 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille3.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR9189" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"> <idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc> <address> <country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-302102" type="indirect"><org type="institution" xml:id="struct-302102" status="VALID"> <orgName>Institut Mines-Télécom [Paris]</orgName>
<desc> <address> <addrLine>37-39 Rue Dareau, 75014 Paris</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.mines-telecom.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Brunskill, Emma" sort="Brunskill, Emma" uniqKey="Brunskill E" first="Emma" last="Brunskill">Emma Brunskill</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-87723" status="VALID"> <orgName>Computer Science Department - Carnegie Mellon University</orgName>
<desc> <address> <addrLine>Computer Science Department Carnegie Mellon University Pittsburgh, PA</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://www.cs.cmu.edu/</ref>
</desc>
<listRelation> <relation active="#struct-378064" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-378064" type="direct"><org type="institution" xml:id="struct-378064" status="INCOMING"> <orgName>University of Pittsburgh</orgName>
<desc> <address> <country key="US"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>États-Unis</country>
<placeName><settlement type="city">Pittsburgh</settlement>
<region type="state">Pennsylvanie</region>
</placeName>
<orgName type="university">Université de Pittsburgh</orgName>
</affiliation>
</author>
<author><name sortKey="Liu, Yun En" sort="Liu, Yun En" uniqKey="Liu Y" first="Yun-En" last="Liu">Yun-En Liu</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-87723" status="VALID"> <orgName>Computer Science Department - Carnegie Mellon University</orgName>
<desc> <address> <addrLine>Computer Science Department Carnegie Mellon University Pittsburgh, PA</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://www.cs.cmu.edu/</ref>
</desc>
<listRelation> <relation active="#struct-378064" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-378064" type="direct"><org type="institution" xml:id="struct-378064" status="INCOMING"> <orgName>University of Pittsburgh</orgName>
<desc> <address> <country key="US"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>États-Unis</country>
<placeName><settlement type="city">Pittsburgh</settlement>
<region type="state">Pennsylvanie</region>
</placeName>
<orgName type="university">Université de Pittsburgh</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:hal-01482765</idno>
<idno type="halId">hal-01482765</idno>
<idno type="halUri">https://hal.inria.fr/hal-01482765</idno>
<idno type="url">https://hal.inria.fr/hal-01482765</idno>
<date when="2017">2017</date>
<idno type="wicri:Area/Hal/Corpus">000659</idno>
<idno type="wicri:Area/Hal/Curation">000659</idno>
<idno type="wicri:Area/Hal/Checkpoint">000033</idno>
<idno type="wicri:explorRef" wicri:stream="Hal" wicri:step="Checkpoint">000033</idno>
<idno type="wicri:Area/Main/Merge">000046</idno>
<idno type="wicri:Area/Main/Curation">000047</idno>
<idno type="wicri:Area/Main/Exploration">000047</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Trading off rewards and errors in multi-armed bandits</title>
<author><name sortKey="Erraqabi, Akram" sort="Erraqabi, Akram" uniqKey="Erraqabi A" first="Akram" last="Erraqabi">Akram Erraqabi</name>
<affiliation wicri:level="1"><hal:affiliation type="researchteam" xml:id="struct-432036" status="VALID"> <idno type="RNSR">200718281V</idno>
<orgName>Sequential Learning</orgName>
<orgName type="acronym">SEQUEL</orgName>
<date type="start">2015-01-01</date>
<desc> <address> <country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/sequel</ref>
</desc>
<listRelation> <relation active="#struct-104752" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-410272" type="direct"></relation>
<relation active="#struct-120930" type="indirect"></relation>
<relation active="#struct-92973" type="indirect"></relation>
<relation active="#struct-301700" type="indirect"></relation>
<relation name="UMR9189" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-302102" type="indirect"></relation>
</listRelation>
<tutelles><tutelle active="#struct-104752" type="direct"><org type="laboratory" xml:id="struct-104752" status="VALID"> <idno type="RNSR">200818245B</idno>
<orgName>Inria Lille - Nord Europe</orgName>
<desc> <address> <addrLine>Parc Scientifique de la Haute Borne 40, avenue Halley Bât.A, Park Plaza 59650 Villeneuve d'Ascq</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/lille/</ref>
</desc>
<listRelation> <relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect"><org type="institution" xml:id="struct-300009" status="VALID"><orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc><address><addrLine>Domaine de Voluceau, Rocquencourt - BP 105, 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-410272" type="direct"><org type="laboratory" xml:id="struct-410272" status="VALID"> <idno type="RNSR">201521249L</idno>
<idno type="IdRef">18388695X</idno>
<orgName>Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189</orgName>
<orgName type="acronym">CRIStAL</orgName>
<date type="start">2015-01-01</date>
<desc> <address> <addrLine>Bâtiment M3, Université Lille 1, 59655 Villeneuve d'Ascq Cedex FRANCE </addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.cristal.univ-lille.fr</ref>
</desc>
<listRelation> <relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-120930" type="direct"></relation>
<relation active="#struct-92973" type="direct"></relation>
<relation active="#struct-301700" type="direct"></relation>
<relation name="UMR9189" active="#struct-441569" type="direct"></relation>
<relation active="#struct-302102" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-120930" type="indirect"><org type="institution" xml:id="struct-120930" status="VALID"> <orgName>Ecole Centrale de Lille</orgName>
<desc> <address> <addrLine>Cité Scientifique - CS 20048 59651 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.ec-lille.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-92973" type="indirect"><org type="institution" xml:id="struct-92973" status="VALID"> <idno type="IdRef">026404184</idno>
<orgName>Université de Lille, Sciences et Technologies</orgName>
<desc> <address> <addrLine>Cité Scientifique - 59655 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille1.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301700" type="indirect"><org type="institution" xml:id="struct-301700" status="VALID"> <idno type="IdRef">026404524</idno>
<idno type="ISNI">0000000121517701</idno>
<orgName>Université de Lille, Sciences Humaines et Sociales</orgName>
<desc> <address> <addrLine>Domaine universitaire du "Pont de Bois", Rue du Barreau, BP 60149, 59653 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille3.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR9189" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"> <idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc> <address> <country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-302102" type="indirect"><org type="institution" xml:id="struct-302102" status="VALID"> <orgName>Institut Mines-Télécom [Paris]</orgName>
<desc> <address> <addrLine>37-39 Rue Dareau, 75014 Paris</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.mines-telecom.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Lazaric, Alessandro" sort="Lazaric, Alessandro" uniqKey="Lazaric A" first="Alessandro" last="Lazaric">Alessandro Lazaric</name>
<affiliation wicri:level="1"><hal:affiliation type="researchteam" xml:id="struct-432036" status="VALID"> <idno type="RNSR">200718281V</idno>
<orgName>Sequential Learning</orgName>
<orgName type="acronym">SEQUEL</orgName>
<date type="start">2015-01-01</date>
<desc> <address> <country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/sequel</ref>
</desc>
<listRelation> <relation active="#struct-104752" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-410272" type="direct"></relation>
<relation active="#struct-120930" type="indirect"></relation>
<relation active="#struct-92973" type="indirect"></relation>
<relation active="#struct-301700" type="indirect"></relation>
<relation name="UMR9189" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-302102" type="indirect"></relation>
</listRelation>
<tutelles><tutelle active="#struct-104752" type="direct"><org type="laboratory" xml:id="struct-104752" status="VALID"> <idno type="RNSR">200818245B</idno>
<orgName>Inria Lille - Nord Europe</orgName>
<desc> <address> <addrLine>Parc Scientifique de la Haute Borne 40, avenue Halley Bât.A, Park Plaza 59650 Villeneuve d'Ascq</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/lille/</ref>
</desc>
<listRelation> <relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect"><org type="institution" xml:id="struct-300009" status="VALID"><orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc><address><addrLine>Domaine de Voluceau, Rocquencourt - BP 105, 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-410272" type="direct"><org type="laboratory" xml:id="struct-410272" status="VALID"> <idno type="RNSR">201521249L</idno>
<idno type="IdRef">18388695X</idno>
<orgName>Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189</orgName>
<orgName type="acronym">CRIStAL</orgName>
<date type="start">2015-01-01</date>
<desc> <address> <addrLine>Bâtiment M3, Université Lille 1, 59655 Villeneuve d'Ascq Cedex FRANCE </addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.cristal.univ-lille.fr</ref>
</desc>
<listRelation> <relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-120930" type="direct"></relation>
<relation active="#struct-92973" type="direct"></relation>
<relation active="#struct-301700" type="direct"></relation>
<relation name="UMR9189" active="#struct-441569" type="direct"></relation>
<relation active="#struct-302102" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-120930" type="indirect"><org type="institution" xml:id="struct-120930" status="VALID"> <orgName>Ecole Centrale de Lille</orgName>
<desc> <address> <addrLine>Cité Scientifique - CS 20048 59651 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.ec-lille.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-92973" type="indirect"><org type="institution" xml:id="struct-92973" status="VALID"> <idno type="IdRef">026404184</idno>
<orgName>Université de Lille, Sciences et Technologies</orgName>
<desc> <address> <addrLine>Cité Scientifique - 59655 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille1.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301700" type="indirect"><org type="institution" xml:id="struct-301700" status="VALID"> <idno type="IdRef">026404524</idno>
<idno type="ISNI">0000000121517701</idno>
<orgName>Université de Lille, Sciences Humaines et Sociales</orgName>
<desc> <address> <addrLine>Domaine universitaire du "Pont de Bois", Rue du Barreau, BP 60149, 59653 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille3.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR9189" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"> <idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc> <address> <country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-302102" type="indirect"><org type="institution" xml:id="struct-302102" status="VALID"> <orgName>Institut Mines-Télécom [Paris]</orgName>
<desc> <address> <addrLine>37-39 Rue Dareau, 75014 Paris</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.mines-telecom.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Valko, Michal" sort="Valko, Michal" uniqKey="Valko M" first="Michal" last="Valko">Michal Valko</name>
<affiliation wicri:level="1"><hal:affiliation type="researchteam" xml:id="struct-432036" status="VALID"> <idno type="RNSR">200718281V</idno>
<orgName>Sequential Learning</orgName>
<orgName type="acronym">SEQUEL</orgName>
<date type="start">2015-01-01</date>
<desc> <address> <country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/sequel</ref>
</desc>
<listRelation> <relation active="#struct-104752" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-410272" type="direct"></relation>
<relation active="#struct-120930" type="indirect"></relation>
<relation active="#struct-92973" type="indirect"></relation>
<relation active="#struct-301700" type="indirect"></relation>
<relation name="UMR9189" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-302102" type="indirect"></relation>
</listRelation>
<tutelles><tutelle active="#struct-104752" type="direct"><org type="laboratory" xml:id="struct-104752" status="VALID"> <idno type="RNSR">200818245B</idno>
<orgName>Inria Lille - Nord Europe</orgName>
<desc> <address> <addrLine>Parc Scientifique de la Haute Borne 40, avenue Halley Bât.A, Park Plaza 59650 Villeneuve d'Ascq</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/lille/</ref>
</desc>
<listRelation> <relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect"><org type="institution" xml:id="struct-300009" status="VALID"><orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc><address><addrLine>Domaine de Voluceau, Rocquencourt - BP 105, 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-410272" type="direct"><org type="laboratory" xml:id="struct-410272" status="VALID"> <idno type="RNSR">201521249L</idno>
<idno type="IdRef">18388695X</idno>
<orgName>Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189</orgName>
<orgName type="acronym">CRIStAL</orgName>
<date type="start">2015-01-01</date>
<desc> <address> <addrLine>Bâtiment M3, Université Lille 1, 59655 Villeneuve d'Ascq Cedex, France</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.cristal.univ-lille.fr</ref>
</desc>
<listRelation> <relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-120930" type="direct"></relation>
<relation active="#struct-92973" type="direct"></relation>
<relation active="#struct-301700" type="direct"></relation>
<relation name="UMR9189" active="#struct-441569" type="direct"></relation>
<relation active="#struct-302102" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-120930" type="indirect"><org type="institution" xml:id="struct-120930" status="VALID"> <orgName>Ecole Centrale de Lille</orgName>
<desc> <address> <addrLine>Cité Scientifique - CS 20048 59651 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.ec-lille.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-92973" type="indirect"><org type="institution" xml:id="struct-92973" status="VALID"> <idno type="IdRef">026404184</idno>
<orgName>Université de Lille, Sciences et Technologies</orgName>
<desc> <address> <addrLine>Cité Scientifique - 59655 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille1.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301700" type="indirect"><org type="institution" xml:id="struct-301700" status="VALID"> <idno type="IdRef">026404524</idno>
<idno type="ISNI">0000000121517701</idno>
<orgName>Université de Lille, Sciences Humaines et Sociales</orgName>
<desc> <address> <addrLine>Domaine universitaire du "Pont de Bois", Rue du Barreau, BP 60149, 59653 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille3.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR9189" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"> <idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc> <address> <country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-302102" type="indirect"><org type="institution" xml:id="struct-302102" status="VALID"> <orgName>Institut Mines-Télécom [Paris]</orgName>
<desc> <address> <addrLine>37-39 Rue Dareau, 75014 Paris</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.mines-telecom.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Brunskill, Emma" sort="Brunskill, Emma" uniqKey="Brunskill E" first="Emma" last="Brunskill">Emma Brunskill</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-87723" status="VALID"> <orgName>Computer Science Department - Carnegie Mellon University</orgName>
<desc> <address> <addrLine>Computer Science Department Carnegie Mellon University Pittsburgh, PA</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://www.cs.cmu.edu/</ref>
</desc>
<listRelation> <relation active="#struct-378064" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-378064" type="direct"><org type="institution" xml:id="struct-378064" status="INCOMING"> <orgName>University of Pittsburgh</orgName>
<desc> <address> <country key="US"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>États-Unis</country>
<placeName><settlement type="city">Pittsburgh</settlement>
<region type="state">Pennsylvanie</region>
</placeName>
<orgName type="university">Université de Pittsburgh</orgName>
</affiliation>
</author>
<author><name sortKey="Liu, Yun En" sort="Liu, Yun En" uniqKey="Liu Y" first="Yun-En" last="Liu">Yun-En Liu</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-87723" status="VALID"> <orgName>Computer Science Department - Carnegie Mellon University</orgName>
<desc> <address> <addrLine>Computer Science Department Carnegie Mellon University Pittsburgh, PA</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://www.cs.cmu.edu/</ref>
</desc>
<listRelation> <relation active="#struct-378064" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-378064" type="direct"><org type="institution" xml:id="struct-378064" status="INCOMING"> <orgName>University of Pittsburgh</orgName>
<desc> <address> <country key="US"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>États-Unis</country>
<placeName><settlement type="city">Pittsburgh</settlement>
<region type="state">Pennsylvanie</region>
</placeName>
<orgName type="university">Université de Pittsburgh</orgName>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">In multi-armed bandits, the most common objective is the maximization of the cumulative reward. Alternative settings include active exploration, where a learner tries to gain accurate estimates of the rewards of all arms. While these objectives are contrasting, in many scenarios it is desirable to trade off rewards and errors. For instance, in educational games the designer wants to gather generalizable knowledge about the behavior of the students and teaching strategies (small estimation errors) but, at the same time, the system needs to avoid giving a bad experience to the players, who may leave the system permanently (large reward). In this paper, we formalize this tradeoff and introduce the ForcingBalance algorithm whose performance is provably close to the best possible tradeoff strategy. Finally, we demonstrate on real-world educational data that ForcingBalance returns useful information about the arms without compromising the overall reward.</div>
</front>
</TEI>
<affiliations><list><country><li>France</li>
<li>États-Unis</li>
</country>
<region><li>Pennsylvanie</li>
</region>
<settlement><li>Pittsburgh</li>
</settlement>
<orgName><li>Université de Pittsburgh</li>
</orgName>
</list>
<tree><country name="France"><noRegion><name sortKey="Erraqabi, Akram" sort="Erraqabi, Akram" uniqKey="Erraqabi A" first="Akram" last="Erraqabi">Akram Erraqabi</name>
</noRegion>
<name sortKey="Lazaric, Alessandro" sort="Lazaric, Alessandro" uniqKey="Lazaric A" first="Alessandro" last="Lazaric">Alessandro Lazaric</name>
<name sortKey="Valko, Michal" sort="Valko, Michal" uniqKey="Valko M" first="Michal" last="Valko">Michal Valko</name>
</country>
<country name="États-Unis"><region name="Pennsylvanie"><name sortKey="Brunskill, Emma" sort="Brunskill, Emma" uniqKey="Brunskill E" first="Emma" last="Brunskill">Emma Brunskill</name>
</region>
<name sortKey="Liu, Yun En" sort="Liu, Yun En" uniqKey="Liu Y" first="Yun-En" last="Liu">Yun-En Liu</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Amérique/explor/PittsburghV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000047 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000047 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Amérique |area= PittsburghV1 |flux= Main |étape= Exploration |type= RBID |clé= Hal:hal-01482765 |texte= Trading off rewards and errors in multi-armed bandits }}
This area was generated with Dilib version V0.6.38.